Finite-State Parsing And Disambiguation
Abstract
A language-independent method of finite-state surface syntactic parsing and word disambiguation is discussed. Input sentences are represented as finite-state networks that already contain all possible roles and interpretations of their units. Syntactic constraint rules are likewise represented as finite-state machines, where each constraint excludes certain types of ungrammatical readings. The whole grammar is the intersection of its constraint rules; it excludes all ungrammatical possibilities and leaves only the correct interpretation(s) of the sentence. The method is being tested for Finnish, Swedish and English.
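The idea of intersecting a sentence network with constraint automata can be illustrated with a small sketch. This is not the authors' implementation: the dictionary-based automaton encoding, the toy tags (`DET`, `NOUN`, `VERB`), the example sentence, and the two constraints are all assumptions chosen for illustration, and the intersection uses the standard product construction.

```python
from functools import reduce

def sentence_network(readings):
    """Acyclic automaton: state i --reading--> state i+1 for each candidate reading."""
    trans = {(i, r): i + 1 for i, opts in enumerate(readings) for r in opts}
    return {"start": 0, "final": {len(readings)}, "trans": trans}

def tag(sym):
    return sym.split("/")[1]

def no_det_before_verb(alphabet):
    """Hypothetical constraint: a DET reading may not be directly followed by a VERB reading."""
    trans = {}
    for sym in alphabet:
        after = 1 if tag(sym) == "DET" else 0   # state 1 = previous reading was DET
        trans[(0, sym)] = after
        if tag(sym) != "VERB":                  # no transition means the reading is excluded
            trans[(1, sym)] = after
    return {"start": 0, "final": {0, 1}, "trans": trans}

def needs_a_verb(alphabet):
    """Hypothetical constraint: an accepted sentence contains at least one VERB reading."""
    trans = {}
    for sym in alphabet:
        trans[(0, sym)] = 1 if tag(sym) == "VERB" else 0
        trans[(1, sym)] = 1
    return {"start": 0, "final": {1}, "trans": trans}

def intersect(a, b):
    """Product construction: accepts exactly the sequences accepted by both a and b."""
    start = (a["start"], b["start"])
    trans, final, todo, seen = {}, set(), [start], {start}
    while todo:
        state = todo.pop()
        p, q = state
        if p in a["final"] and q in b["final"]:
            final.add(state)
        for (src, sym), p2 in a["trans"].items():
            q2 = b["trans"].get((q, sym))
            if src != p or q2 is None:
                continue
            trans[(state, sym)] = (p2, q2)
            if (p2, q2) not in seen:
                seen.add((p2, q2))
                todo.append((p2, q2))
    return {"start": start, "final": final, "trans": trans}

def accepted_paths(m):
    """Enumerate all accepted readings; assumes the automaton is acyclic."""
    out = []
    def walk(state, path):
        if state in m["final"]:
            out.append(path)
        for (src, sym), dst in m["trans"].items():
            if src == state:
                walk(dst, path + [sym])
    walk(m["start"], [])
    return out

# Toy ambiguous input: every word carries all of its candidate readings.
readings = [["the/DET"], ["can/NOUN", "can/VERB"], ["rusts/VERB", "rusts/NOUN"]]
alphabet = [r for opts in readings for r in opts]

# The grammar is the intersection of its constraint automata.
grammar = reduce(intersect, [no_det_before_verb(alphabet), needs_a_verb(alphabet)])
result = accepted_paths(intersect(sentence_network(readings), grammar))
print(result)
```

In this toy setup the sentence network encodes four candidate readings, and intersecting it with the grammar leaves only the one that violates neither constraint. Because intersection is commutative and associative, the order in which the constraints are combined does not change the surviving readings.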
Similar resources
A Cascaded Finite-State Parser for German
The paper presents two approaches to partial parsing of German: a tagger trained on dependency tuples, and a cascaded finite-state parser (Abney, 1997). For the tagging approach, the effects of choosing different representations of dependency tuples are investigated. Performance of the finite-state parser is boosted by delaying syntactically unsolvable disambiguation problems via underspecifica...
Schematic Finite-State Intersection Parsing
The framework of Finite-State Intersection Grammars employs a parsing technique according to which several finite-state automata are intersected to determine the output automaton. Implementation of the intersection parser has turned out to be a difficult task. Several problems in efficiency arise when disambiguation choices are based on long contexts with many don’t cares. We are concerned with...
Implementing Voting Constraints With Finite State Transducers
We describe a constraint-based morphological disambiguation system in which individual constraint rules vote on matching morphological parses followed by its implementation using finite state transducers. Voting constraint rules have a number of desirable properties: The outcome of the disambiguation is independent of the order of application of the local contextual constraint rules. Thus the r...
Rule-based Approach to Korean Morphological Disambiguation Supported by Statistical Method
Korean as an agglutinative language shows its proper types of difficulties in morphological disambiguation, since a large number of its ambiguities comes from the stemming while most of ambiguities in French or English are related to the categorization of a morpheme. The current Korean morphological disambiguation systems adopt mainly statistical methods and some of them use rules in the postpr...
Partial Parsing of Spontaneous Spoken French
This paper describes the process and the resources used to automatically annotate a French corpus of spontaneous speech transcriptions in super-chunks. Super-chunks are enhanced chunks that can contain lexical multiword units. This partial parsing is based on a preprocessing stage of the spoken data that consists in reformatting and tagging utterances that break the syntactic structure of the t...